sentence compression
Improving Natural Language Processing Tasks with Human Gaze-Guided Neural Attention: Supplementary Material
To gain further insight into the comparison between our model and the current state of the art in sentence compression, we show results of our method and its ablations alongside ablations of the method by Zhao et al. (see Table 1), together with the number of model parameters. In their work, the authors added a syntax-based component to the model. We show that our model, without the additional syntactic information used in previous methods, still obtains SOTA performance.

Figure 1: Additional paraphrase generation attention maps from our ablation study, for both sub-networks (TSM predictions and upstream task attention) in our joint architecture. TSM fixation predictions (left, in blue) are shown over epochs (the last epoch is our converged model). However, we assume they do not play a role in the performance difference between these two conditions, as these differences are not statistically significant.
We thank all reviewers for their detailed and valuable feedback. In the following, we address the reviewers' comments. We agree with R3 that [3,37] pioneered gaze integration in NLP tasks, paving the way for this line of work. We did not intend to claim we are the first to propose gaze integration in NLP. To the best of our knowledge, no previous works studied gaze integration for the paraphrase generation task. The authors did not reply to our requests for details on the splits. All ablations are inferior to our full model.
InstructCMP: Length Control in Sentence Compression through Instruction-based Large Language Models
Juseon-Do, Kwon, Jingun, Kamigaito, Hidetaka, Okumura, Manabu
Extractive summarization can produce faithful summaries but often requires additional constraints, such as a desired summary length. Traditional sentence compression models typically cannot accommodate such constraints because of their restricted model capacity, requiring model modifications to cope with them. To bridge this gap, we propose Instruction-based Compression (InstructCMP), an approach to the sentence compression task that can consider the length constraint through instructions by leveraging the zero-shot task-solving abilities of Large Language Models (LLMs). For this purpose, we created new evaluation datasets by transforming traditional sentence compression datasets into an instruction format. Using these datasets, we first reveal that current LLMs still face challenges in accurately controlling the length of a compressed text. To address this issue, we propose an approach named "length priming," which incorporates additional length information into the instructions without external resources. While length priming works effectively in a zero-shot setting, a training dataset with the instructions would further improve the ability of length control. Thus, we additionally created a training dataset in an instruction format to fine-tune the model. Experimental results and analysis show that applying length priming significantly improves the performance of InstructCMP in both zero-shot and fine-tuning settings, without the need for any model modifications.
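The core idea of the abstract, putting explicit length information into the instruction itself, can be sketched as a prompt builder. The template below is a minimal illustration and an assumption on our part, not the paper's actual instruction wording:

```python
def build_prompt(sentence: str, target_len: int) -> str:
    """Build a compression instruction that includes 'length priming':
    the source length, the target length, and the number of words to
    delete are all stated explicitly in the instruction. The template
    is hypothetical, for illustration only."""
    src_len = len(sentence.split())
    return (
        f"Sentence ({src_len} words): {sentence}\n"
        f"Instruction: Compress the sentence above to exactly "
        f"{target_len} words by deleting {src_len - target_len} words, "
        f"while preserving the original meaning."
    )
```

In a zero-shot setting, such a prompt would be sent directly to an LLM; for the fine-tuning setting described above, the same format would be used to construct instruction-style training pairs.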
Task-Oriented Paraphrase Analytics
Gohsen, Marcel, Hagen, Matthias, Potthast, Martin, Stein, Benno
Since paraphrasing is an ill-defined task, the term "paraphrasing" covers text transformation tasks with different characteristics. Consequently, existing paraphrasing studies have applied quite different (explicit and implicit) criteria as to when a pair of texts is to be considered a paraphrase, all of which amount to postulating a certain level of semantic or lexical similarity. In this paper, we conduct a literature review and propose a taxonomy to organize the 25 identified paraphrasing (sub-)tasks. Using classifiers trained to identify the tasks that a given paraphrasing instance fits, we find that the distributions of task-specific instances in the known paraphrase corpora vary substantially. This means that the use of these corpora, without the respective paraphrase conditions being clearly defined (which is the normal case), must lead to incomparable and misleading results.
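The abstract's observation that paraphrase definitions reduce to "a certain level of semantic or lexical similarity" can be made concrete with a toy router that buckets a paraphrase pair by lexical overlap and length ratio. The thresholds and bucket names below are illustrative assumptions, not the paper's taxonomy or classifiers:

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two texts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def rough_task_label(src: str, para: str) -> str:
    """Very rough routing of a paraphrase pair to a hypothetical
    sub-task bucket. Thresholds and labels are illustrative only;
    the paper trains classifiers over 25 identified (sub-)tasks."""
    sim = jaccard(src, para)
    ratio = len(para.split()) / max(len(src.split()), 1)
    if ratio < 0.6:
        return "compression-like"   # output much shorter than input
    if sim > 0.8:
        return "near-duplicate"     # high lexical overlap
    return "free paraphrase"        # rewording with moderate overlap
```

Even this crude heuristic makes the paper's point visible: corpora built under different implicit criteria will populate these buckets very differently.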
From Lengthy to Lucid: A Systematic Literature Review on NLP Techniques for Taming Long Sentences
Passali, Tatiana, Chatzikyriakidis, Efstathios, Andreadis, Stelios, Stavropoulos, Thanos G., Matonaki, Anastasia, Fachantidis, Anestis, Tsoumakas, Grigorios
Long sentences have been a persistent issue in written communication for many years, since they make it challenging for readers to grasp the main points or follow the writer's initial intention. This survey, conducted using the PRISMA guidelines, systematically reviews two main strategies for addressing the issue of long sentences: a) sentence compression and b) sentence splitting. Interest in this area has been rising since 2005, with significant growth after 2017. Current research is dominated by supervised approaches for both sentence compression and splitting. Yet, there is a considerable gap in weakly and self-supervised techniques, suggesting an opportunity for further research, especially in domains with limited data. In this survey, we categorize and group the most representative methods into a comprehensive taxonomy. We also conduct a comparative evaluation analysis of these methods on common sentence compression and splitting datasets. Finally, we discuss the challenges and limitations of current methods, providing valuable insights for future research directions. This survey is meant to serve as a comprehensive resource for addressing the complexities of long sentences. We aim to enable researchers to make further advancements in the field until long sentences are no longer a barrier to effective communication.
Revision for Concision: A Constrained Paraphrase Generation Task
Academic writing should be concise, as concise sentences better keep the readers' attention and convey meaning clearly. Writing concisely is challenging, and writers often struggle to revise their drafts. We introduce and formulate revising for concision as a natural language processing task at the sentence level. Revising for concision requires algorithms to use only necessary words to rewrite a sentence while preserving its meaning. The revised sentence should be evaluated according to its word choice, sentence structure, and organization. The revised sentence also needs to fulfil semantic retention and syntactic soundness. To aid these efforts, we curate and make available a benchmark parallel dataset that can depict revising for concision. The dataset contains 536 pairs of sentences before and after revising, and all pairs are collected from college writing centres. We also present and evaluate approaches to this problem, which may assist researchers in this area.